Interoperable Distributed Data Warehouse Components
Authors
Abstract
Extraction, Transformation and Loading (ETL) are the core functions of data warehouse (DW) solutions. In the current ETL framework, the ETL components are tightly coupled, so they lack distribution and interoperability, a gap that leads to many problems in the ETL domain. The same tight coupling makes the framework hard to extend: adding new components to meet special business needs is difficult. This paper discusses how to distribute the Extraction, Transformation and Loading components so that they become distributed and interoperable, and shows how the ETL framework can be extended more easily. To that end, Service Oriented Architecture (SOA) is adopted to restructure the current ETL framework and supply the missing distribution and interoperability features. In addition, a Classified-Fragmentation component, intended to speed up report generation, is added to the new framework as a proof of the extensibility concept. The result is a conceptual framework for interoperable distributed ETL components. It is defined as a common ETL framework, valid for any ETL implementation that chooses it as a base. The theoretical framework has been validated by experts from industrial companies.
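The loose coupling described in the abstract can be illustrated with a minimal sketch. It is not the paper's framework: the `ServiceRegistry` class, the `run_pipeline` function, the sample records, and the `tag_large` extension component are all hypothetical names invented here, and an in-process registry stands in for real SOA service discovery and remote invocation. The only point is that each ETL step sits behind the same interface, so a new component can be added without changing the existing ones.

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]
# In this sketch a "service" is any callable mapping a batch of records to a new batch.
Service = Callable[[List[Record]], List[Record]]


class ServiceRegistry:
    """Hypothetical in-process stand-in for SOA service discovery."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        self._services[name] = service

    def lookup(self, name: str) -> Service:
        return self._services[name]


def run_pipeline(registry: ServiceRegistry, steps: Iterable[str]) -> List[Record]:
    """Invoke the named services in order, feeding each output to the next."""
    batch: List[Record] = []
    for name in steps:
        batch = registry.lookup(name)(batch)
    return batch


# Illustrative components; the data and logic are invented for this sketch.
def extract(_: List[Record]) -> List[Record]:
    return [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "3"}]


def transform(batch: List[Record]) -> List[Record]:
    # Normalise the amount field from text to a number.
    return [{**r, "amount": float(r["amount"])} for r in batch]


WAREHOUSE: List[Record] = []  # stands in for the target data warehouse


def load(batch: List[Record]) -> List[Record]:
    WAREHOUSE.extend(batch)
    return batch


def tag_large(batch: List[Record]) -> List[Record]:
    # Hypothetical extension component, added without touching the others.
    return [{**r, "large": r["amount"] > 5} for r in batch]


registry = ServiceRegistry()
registry.register("extract", extract)
registry.register("transform", transform)
registry.register("load", load)
registry.register("tag_large", tag_large)

result = run_pipeline(registry, ["extract", "transform", "tag_large", "load"])
```

In a distributed deployment each registered callable would presumably be replaced by a client for a remote service; the pipeline code calling `lookup` would not need to change.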
Similar resources
Multiple View Consistency for Data Warehousing
A data warehouse stores integrated information from multiple distributed data sources. In effect, the warehouse stores materialized views over the source data. The problem of ensuring data consistency at the warehouse can be divided into two components: ensuring that each view reflects a consistent state of the base data, and ensuring that multiple views are mutually consistent. In this paper w...
An ontology-based data warehouse for diagnosis and communication in intensive care settings
In the intensive care unit (ICU), a timely supply of all needed information is of the utmost importance. To facilitate this demand, we plan for a data warehouse for the ICU that employs data from multiple clinical sources as well as clinical decision support systems for analysis. This clinically integrated system enables the automated generation of concise and accurate reports, the automated cl...
Enhanced Knowledge Warehouse
Introduction. Enhanced knowledge warehouse (eKW) is an extension of the enhanced data warehouse (eDW) system (Abramowicz, 2002). eKW is a Web services-based system that allows the automatic filtering of information from the Web to the data warehouse and automatic retrieval through the data warehouse. Web services technology extends eKW beyond the organization. It makes the system open and allows...
The HMO Research Network Virtual Data Warehouse: A Public Data Model to Support Collaboration
The HMO Research Network (HMORN) Virtual Data Warehouse (VDW) is a public, non-proprietary, research-focused data model implemented at 17 health care systems across the United States. The HMORN has created a governance structure and specified policies concerning the VDW's content, development, implementation, and quality assurance. Data extracted from the VDW have been used by thousands of stud...
The Role of Meta-Objects and Self-Description in an Engineering Data Warehouse
As enterprises, data and functions become increasingly complex and distributed the need for information systems to be both customisable and interoperable also increases. Large scale engineering and scientific projects demand flexibility in order to evolve over time and to interact with external systems (both newly designed and legacy in nature) while retaining a degree of conceptual simplicity....